OCR Error Rate Versus Rejection Rate for Isolated Handprint Characters

Authors

  • Jon Geist
  • R. Allen Wilkinson

Abstract

Over twenty-five organizations participating in the First Census OCR Systems Conference submitted confidence data as well as character classification data for the digit test in that Conference. A three-parameter function of the rejection rate r was fit to the error rate versus rejection rate data derived from these data, and was found to fit very well over the range from r = 0 to r = 0.15. The probability distribution underlying the model e(r) curve was derived and shown to correspond to an inherently inefficient rejection process. With only a few exceptions that seem to be insignificant, all of the systems submitting data to the Conference for scoring seem to employ this same rejection process, with a remarkable uniformity of efficiency relative to the maximum efficiency allowed for this process. Human classification of a subset of the digit test suggests that there is considerable room for improvement in the performance of machine OCR before the theoretical ideal is achieved.

Keywords: error rate; isolated character; hand print; OCR; segmented character.
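As a rough illustration of what an error rate versus rejection rate curve is, the sketch below builds one from per-character confidence values. It is not the authors' procedure: the function name `error_vs_rejection`, the choice to reject characters in order of increasing confidence, and the definition of error rate as the misclassified fraction of the accepted characters are all assumptions made for this example. The three-parameter model itself is not reproduced, since the abstract does not give its form.

```python
from typing import List, Sequence, Tuple


def error_vs_rejection(
    confidences: Sequence[float],
    correct: Sequence[bool],
) -> List[Tuple[float, float]]:
    """Return (rejection rate r, error rate e) pairs.

    Assumptions (not taken from the paper): characters are rejected in order
    of increasing confidence, and e is the fraction of the accepted
    characters that are misclassified.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    # Order the characters from least to most confident.
    order = sorted(range(n), key=lambda i: confidences[i])
    errors = sum(1 for ok in correct if not ok)
    curve = []
    for k in range(n):  # k = number of characters rejected so far
        accepted = n - k
        curve.append((k / n, errors / accepted))
        # Rejecting the next least-confident character removes its error, if any.
        if not correct[order[k]]:
            errors -= 1
    return curve


# Hypothetical data: four characters, one misclassified with low confidence.
curve = error_vs_rejection([0.9, 0.2, 0.8, 0.4], [True, False, True, True])
```

With these made-up inputs, rejecting the single least-confident character removes the only error, so the error rate drops from 0.25 at r = 0 to 0 at r = 0.25. A rejection process that orders characters less well would need a larger r to reach the same error rate.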

Similar resources

OCR - Optical Character Recognition

Character recognition techniques associate a symbolic identity with the image of a character. Character recognition is commonly referred to as optical character recognition (OCR), as it deals with the recognition of optically processed characters. The modern version of OCR appeared in the middle of the 1940s with the development of digital computers. OCR machines have been commercially avail...

Kodak Imagelink™ OCR Alphanumeric Handprint Module

This paper describes the Kodak Imagelink™ OCR alphanumeric handprint module. There are two neural network algorithms at its core: the first network is trained to find individual characters in an alphanumeric field, while the second one performs the classification. Both networks were trained on Gabor projections of the original pixel images, which resulted in higher recognition rates and greater...

Rejection Threshold Estimation for an Unknown Language Model in an OCR Task

In an OCR post-processing task, a language model is used to find the best transformation of the OCR hypothesis into a string compatible with the language. The cost of this transformation is used as a confidence value to reject the strings that are less likely to be correct, and the error rate of the accepted strings should be strictly controlled by the user. In this work, the expected error rat...

Techniques for Highly Accurate Optical Recognition of Handwritten Characters and Their Application to Sixth Chinese National Population Census

Highly accurate optical character recognition (OCR) of handwritten characters is still a challenging task, especially for languages like Chinese and Japanese. To improve the accuracy, we developed four techniques for enhanced recognition: character recognition based on modified linear discriminant analysis (MLDA), subspace-based similar-character discrimination, multi-classifier combination, an...

Interaction for style-constrained OCR

The error rate can be considerably reduced on a style-consistent document if its style is identified and the right style-specific classifier is used. Since in some applications both machines and humans have difficulty in identifying the style, we propose a strategy to improve the accuracy of style-constrained classification by enlisting the human operator to identify the labels of some characte...

Publication date: 2010